Clinical Chemistry
Oxford University Press (OUP)
Preprints posted in the last 30 days, ranked by how well they match Clinical Chemistry's content profile, based on 22 papers previously published here. The average preprint has a 0.03% match score for this journal, so anything above that is already an above-average fit.
Bharne, D.; Gaston, D.
Amplicon-based targeted sequencing panels are one of the current workhorses of next-generation sequencing in clinical molecular diagnostics laboratories for profiling somatic mutations in tumours. Many open-source somatic variant callers are available; however, their use in clinical applications remains underexplored. Therefore, we integrated the outputs of six variant callers (FreeBayes, MuTect2, Pisces, Platypus, VarDict and VarScan) into a Snakemake pipeline and evaluated tumour-only data from the HD789 commercial reference standard sequenced in triplicate on three different sequencing runs using the Illumina AmpliSeq Focus panel on MiSeq and NextSeq 2000. A 1:4 dilution sample was sequenced to evaluate the limits of variant detection. The called variants were analysed with respect to depth, allele frequency, and other sequencing metrics. The variant callers were evaluated by their level of concordance and their performance on known somatic variants. FreeBayes consistently called the largest number of somatic variants in each sample but also included more potential artifacts. Overall, FreeBayes, VarScan, MuTect2, and Pisces had the best performance on HD789 data.
Halldorsson, S.; Nagymihaly, R. M.; Bope, C. D.; Lund-Iversen, M.; Niehusmann, P.; Lien-Dahl, T.; Pahnke, J.; Bruning, T.; Kongelf, G.; Patel, A.; Sahm, F.; Euskirchen, P.; Leske, H.; Vik-Mo, E. O.
Background: Classification of central nervous system (CNS) tumors has become increasingly complex, raising concerns about the sustainability of comprehensive molecular diagnostics. We have evaluated nanopore whole genome sequencing (nWGS) as a single workflow to replace multiple diagnostic assays. Methods: We performed nWGS on DNA extracted from 90 adult CNS tumor samples (58 retrospective, 32 prospective) and compared the results to findings from standard of care (SoC) diagnostic work-up. Analysis was done through an automated workflow that consolidated diagnostically and therapeutically relevant genomic alterations, including copy-number variation, structural, and single-nucleotide variants, chromosomal aberrations, gene fusions, and methylation-based classification. Results: nWGS supported final diagnostic classification in all samples with >15% tumor cell content, requiring ~3 hours of hands-on library preparation, parallel sample processing, and sequencing times within 72 hours. Methylation-based classification was available within 1 hour and was concordant with the integrated final diagnosis in 89% of cases (80/90). All diagnostically relevant copy-number variations, single-nucleotide variants, and gene fusions were concordant with SoC testing. MGMT promoter methylation status matched in 94% of cases. In addition, nWGS identified prognostic and potentially actionable variants that were not reported or covered by SoC. Conclusions: nWGS delivers comprehensive genetic and epigenetic results with a fast turn-around compared to standard methods. This enables efficient, accurate, and scalable molecular diagnostics of CNS tumors using a single platform. This data supports its implementation in routine clinical practice and may be extended to other cancer types requiring complex genomic profiling.
Powell, S.; Bui, T.; Gullipalli, D.; LaCava, M.; Jones, S. M.; Hansen, T.; Kuhr, F.; Swat, W.; Simandi, Z.
Current clinical management of multiple myeloma (MM) relies on bone marrow (BM) biopsies for minimal residual disease (MRD) assessment. While BM biopsies are the gold standard, their invasive nature and potential to miss extramedullary or patchy disease necessitate sensitive, non-invasive liquid biopsy platforms. In this study, we evaluated the analytical performance of the CellSearch CMMC assay to determine its utility for deep-MRD monitoring. Using a standard 4 mL whole blood input, the assay achieves a WBC-normalized sensitivity of 2.45 × 10⁻⁷, supported by a limit of quantitation of 5 cells per run. Given this high analytical sensitivity, the assay provides a robust negative predictive value, rendering false-negative findings highly unlikely in populations with detectable peripheral disease. These findings characterize the CellSearch CMMC assay as a highly sensitive, analytically validated platform for non-invasive longitudinal surveillance at deep-MRD levels. When integrated into a clinical workflow that accounts for its specificity profile, the platform offers a patient-friendly complement to serial BM biopsies, with the potential to reduce their frequency in appropriate clinical contexts.
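The reported WBC-normalized sensitivity can be read as the limit of quantitation divided by the number of white blood cells in the analysed sample. A minimal sketch of that arithmetic follows; the white-cell concentration of 5.1 × 10^6 per mL is a hypothetical value, not stated in the abstract, chosen only because it reproduces the reported figure.

```python
def wbc_normalized_sensitivity(loq_cells, blood_volume_ml, wbc_per_ml):
    """Detectable target cells per white blood cell analysed."""
    total_wbc = blood_volume_ml * wbc_per_ml
    return loq_cells / total_wbc

# 5 cells per run and 4 mL input come from the abstract;
# 5.1e6 WBC/mL is an assumed, illustrative white-cell count.
sensitivity = wbc_normalized_sensitivity(loq_cells=5, blood_volume_ml=4.0, wbc_per_ml=5.1e6)
print(f"{sensitivity:.2e}")
```

Under this reading, the normalized figure shifts with each patient's white-cell count even when the limit of quantitation is fixed.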
Sines, B. J.; Hagan, R. S.; Jiang, X.; Pavlechko, E.; McClain, S.; Hunt, X.; Florou-Moreno, J.; Acquardo, J.; Risa, G.; Valsaraj, V.; Schisler, J. C.; Wolfgang, M. C.
Objective: To develop a workflow that transforms electronic health record data into machine learning-ready features for molecular endotype assignment and to evaluate whether clinician-informed feature engineering improves model performance and interpretability. Materials and Methods: We developed parallel clinician-informed and clinician-agnostic feature engineering pipelines to prepare raw EHR data from mechanically ventilated patients with respiratory failure. Molecular endotype labels derived from paired deep lung and blood profiling of subjects with acute lung injury were used to train candidate machine learning classifiers. Champion models from each pipeline were compared on predefined performance metrics. Results: Bayesian network classifiers were the top-performing models in both pipelines. The clinician-informed pipeline generated fewer features than the clinician-agnostic pipeline (645 vs 1,127) and produced a lower misclassification rate in the final Bayesian network model (0.047 vs 0.14). In an independent cohort of subjects with acute lung injury, the clinician-informed model better distinguished corticosteroid-responsive from non-responsive subgroups. Discussion: Clinical context improved feature engineering efficiency, model interpretability, and classification performance. These findings support the integration of domain expertise into machine learning workflows intended for critical care implementation. Conclusions: Clinician-informed feature engineering can simplify machine learning models while improving performance and preserving clinical relevance. AI tools developed for healthcare should incorporate subject matter expertise early in the feature engineering and analytic workflow.
Guerrero Quiles, C.; Lodhi, T.; Sellers, R.; Sahoo, S.; Weightman, J.; Breitwieser, W.; Sanchez Martinez, D.; Bartak, M.; Shamim, A.; Lyons, S.; Reeves, K.; Reed, R.; Hoskin, P.; West, C.; Forker, L.; Smith, T.; Bristow, R.; Wedge, D. C.; Choudhury, A.; Biolatti, L. V.
Whole-genome sequencing (WGS) enables comprehensive analysis of tumour genomes, but its use in formalin-fixed paraffin-embedded (FFPE) samples is limited by DNA fragmentation and low yields. Whole-genome amplification (WGA) methods such as multiple displacement amplification (MDA) can boost DNA availability but distort copy-number alteration (CNA) profiles. DNA ligation-mediated MDA (DLMDA) mitigates this bias by reconstituting fragmented templates, yet its performance in FFPE-derived DNA remains uncertain. We compared paired DLMDA pre-amplified (2h, 8h) and non-pre-amplified FFPE prostate tumour samples from 22 archival blocks (5, 15 and 20 years old). DLMDA increased DNA yield by 42- to 86-fold, with global CNA patterns largely preserved. However, DLMDA significantly reduced the number of detected CNA deletions and amplifications. These effects were independent of both block age and reaction time. CNA dropouts were randomly distributed across the genome, indicating that DLMDA does not introduce regional bias. Our results show that DLMDA enables robust DNA yield recovery and avoids false-positive CNA artefacts, but at the cost of reduced CNA sensitivity. While suitable for CNA screening pipelines through WGS, further improvements are required to minimise the false-negative risk and improve the technique's sensitivity for FFPE-based genomics.
Warner, B. E.; Patel, J.; Satterwhite, R.; Wang, R.; Adams-Haduch, J.; Koh, W.-P.; Yuan, J.-M.; Shair, K. H. Y.
Purpose: Antibodies to Epstein-Barr virus (EBV) proteins can predict nasopharyngeal carcinoma (NPC) risk. We previously defined a prototype EBNA1 protein panel and multiplex immunoblot assay that distinguishes NPC risk several years pre-diagnosis. Assay throughput and specificity are critical to effectively implement a population-level screening program. Here, we developed a strip test assay - EBNA1 SeroStrip-HT - with the objective of increasing throughput and maximizing specificity. Experimental Design: EBNA1 full-length (FL) and glycine-alanine repeat deletion mutants (dGAr) were purified from insect and mammalian cells to screen serum IgA/IgG from prospective cohorts in Singapore and Shanghai, China, with known time intervals to NPC diagnosis. Twenty pre-diagnostic sera within 4 years of diagnosis were compared to 96 healthy controls using a nested case-control study design. Results: IgA to mammalian-derived EBNA1 dGAr achieved 85.0% sensitivity and 94.8% specificity (AUC, 0.939) for NPC status. IgA to insect-derived EBNA1 dGAr showed the same sensitivity (85.0%) and similar specificity (93.8%) (AUC, 0.941). IgA to insect-derived EBNA1 FL had a higher sensitivity (90%) but a lower specificity (91.7%) (AUC, 0.940). Combining EBNA1 FL and dGAr results showed that subjects positive for both proteins had an odds ratio of 243.67 for NPC incidence compared to double-negative scores. Conclusion: This study demonstrated the efficacy of EBNA1 SeroStrip-HT for NPC risk assessment and stratification in high- and intermediate-risk populations, yielding high accuracy and a 12-fold increased throughput over the prototype. The insect system was appropriate for large-scale production of purified EBNA1. Larger, geographically diverse cohorts are warranted to confirm these results, especially in low-incidence populations.
Qian, K.; Abhyankar, V.; Keo, D.; Zarceno, P.; Toy, T.; Eskin, E.; Arboleda, V. A.
Sequencing the respiratory tract transcriptome has the potential to provide insights into infectious pathogens and the host's immune response. While DNA-based sequencing is more standard in clinical laboratories due to its stability, RNA assays offer unique advantages. RNA reflects dynamic physiological changes, and for RNA viruses, viral RNA particles directly represent copies of the viral genome, enabling greater diagnostic sensitivity. However, RNA's susceptibility to degradation remains a significant challenge, particularly in RNase-rich specimens like saliva. To address this, we conducted a systematic, combinatorial evaluation of 24 distinct mNGS workflows, crossing eight nucleic acid extraction methods with three RNA-Seq library preparation protocols. Remnant saliva samples (n = 6) were pooled and spiked with MS2 phage as a control. The SARS-CoV-2 virus was spiked into half of the samples, which were extracted using the eight different extraction methods (n = 3) and compared using RNA Integrity Number equivalent (RINe) scores and RNA concentration. The extracted RNA was then processed across the three library construction methods and subjected to short-read sequencing to assess all 24 combinations head-to-head. We compared methods based on viral read recovery and found that RINe and concentration did not correlate with viral detection. The Zymo Quick-RNA Magbead kit and the Tecan Revelo RNA-Seq High-Sensitivity RNA library kit were the extraction and library-preparation kits that yielded the most SARS-CoV-2 reads, respectively. Importantly, our combinatorial analysis revealed that any small variability attributable to different nucleic acid extraction methods was heavily overshadowed by differences in quality attributable to the RNA-Seq library preparation methods. These findings challenge the reliance on conventional RNA quality metrics for clinical metagenomics and underscore the need to redefine extraction quality standards for mNGS applications.
Importance: mNGS is a powerful and unbiased approach to pathogen detection that has mostly been applied to blood and cerebrospinal fluid samples. However, mNGS has recently been applied to more areas, including respiratory pathogen detection, with potential applications in both in-patient diagnostics and public health surveillance. Saliva samples are an ideal sample type for these use cases since they can be collected non-invasively. However, saliva is also a challenging sample type due to its high RNase activity and often yields low-quality nucleic acid. This study explores the feasibility of using saliva specimens in mNGS with contrived SARS-CoV-2 samples to optimize the combination of two factors: nucleic acid extraction and RNA-seq library preparation. Exploration in this area could enhance the sensitivity of saliva-based mNGS assays, with the goal of future expansion of this specimen type in clinical diagnostics and public health surveillance.
Key Points:
- The choice of RNA-Seq library preparation kit has a greater impact on pathogen detection than the nucleic acid extraction method.
- The combination of the Zymo Quick-RNA Magbead extraction kit and the Tecan Revelo RNA-Seq High-Sensitivity RNA library kit recovered the highest percentage of total SARS-CoV-2 reads.
- RNA quantity and RINe score do not correlate with viral read capture, indicating a need for an alternative metric to assess RNA quality for downstream mNGS clinical diagnostics.
Kamhieh-Milz, J.; Kamhieh-Milz, S.; Schwarz, F.; Michel, J.; Nitsche, A.; Puyskens, A.
Mpox poses an ongoing global public health threat, with case numbers rising beyond traditionally endemic regions in Central and Western Africa. Rapid detection of the causative agent, the Monkeypox virus (MPXV), is critical for outbreak control, yet laboratory infrastructure and trained personnel remain scarce in many affected areas. Point-of-care molecular diagnostics offer a practical solution by enabling timely testing without specialized equipment or elaborate nucleic acid extraction. We evaluated the performance of an extraction-free RNase HII-assisted amplification (RHAM) assay for MPXV detection by Pluslife Biotech, a novel isothermal amplification technology providing results in under 30 minutes. The Pluslife RHAM test demonstrated pan-MPXV clade reactivity, detecting all four MPXV clades (Ia, Ib, IIa, IIb) with high analytical sensitivity and no cross-reactivity to other poxviruses or other clinically relevant pathogens. The assay proved compatible with diverse clinical specimen types, including lesion swabs, oropharyngeal swabs, rectal swabs, urine, semen, and wound exudate. As part of routine diagnostics at the German Consultant Laboratory for Poxviruses, in a comprehensive evaluation of 206 clinical specimens against diagnostic real-time PCR, the Pluslife RHAM test achieved a diagnostic sensitivity of 94.2% (95% CI: 85.8-98.4%) and a specificity of 100% (95% CI: 97.3-100%). Notably, samples with higher viral loads (Ct <30) showed 100% sensitivity. Time-to-result correlated significantly with viral load, enabling faster diagnosis in high-viral-load cases. The Pluslife RHAM test represents a practical, sensitive, and rapid point-of-care solution for MPXV detection in resource-limited settings, combining strong analytical performance with operational simplicity to support timely outbreak response and clinical decision-making.
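The confidence intervals quoted for sensitivity and specificity are of the kind produced by an exact binomial method. The sketch below computes Clopper-Pearson intervals by bisection using only the standard library. The abstract does not say which interval method was used, and the counts 65/69 and 137/137 are hypothetical values chosen to be consistent with the reported percentages and the total of 206 specimens; they are not taken from the source.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided confidence interval for the proportion x/n."""
    def solve(f, target):
        # f decreases monotonically in p, so bisection on [0, 1] converges.
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if f(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: P(X >= x | p) = alpha/2, i.e. cdf(x - 1 | p) = 1 - alpha/2.
    lower = 0.0 if x == 0 else solve(lambda p: binom_cdf(x - 1, n, p), 1 - alpha / 2)
    # Upper bound: P(X <= x | p) = alpha/2.
    upper = 1.0 if x == n else solve(lambda p: binom_cdf(x, n, p), alpha / 2)
    return lower, upper

# Assumed counts (not stated in the abstract): 65 of 69 PCR-positive
# specimens detected, 137 of 137 PCR-negative specimens negative.
sens_ci = clopper_pearson(65, 69)
spec_ci = clopper_pearson(137, 137)
print(f"sensitivity 95% CI: {sens_ci[0]:.3f}-{sens_ci[1]:.3f}")
print(f"specificity 95% CI: {spec_ci[0]:.3f}-{spec_ci[1]:.3f}")
```

When every specimen in a group is classified correctly, the upper bound is exactly 1 and the lower bound reduces to (alpha/2)^(1/n), which is why a perfect point estimate still carries a finite lower limit.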
Korni, A.; Zandi, E.
Background: Plasma biomarkers demonstrate strong within-cohort performance for identifying cerebral amyloid pathology, but their real-world clinical utility depends on generalization across populations and assay platforms. The impact of cross-cohort deployment on clinically actionable metrics such as negative predictive value (NPV) remains poorly characterized. Objective: To evaluate the performance and portability of plasma biomarker-based machine learning models for amyloid PET prediction across independent cohorts, with emphasis on calibration and clinically relevant predictive values. Methods: Data from ADNI (n=885) and A4 (n=822) were analyzed. Machine learning models were trained within each cohort to predict amyloid PET status and continuous amyloid burden (centiloids). Performance was assessed using ROC AUC, accuracy, R², and RMSE. Cross-cohort generalizability was evaluated using bidirectional transfer without retraining. Calibration, predictive values, and decision curve analysis were used to assess clinical utility. Results: Within-cohort discrimination was high (AUC up to 0.913 in ADNI and 0.870 in A4), with moderate performance for centiloid prediction (R² up to 0.628 and 0.535, respectively). Cross-cohort deployment resulted in modest attenuation of AUC (~4-7%) but substantially greater degradation in clinically actionable performance. NPV declined from 0.831 to 0.644 under ADNI→A4 transfer (~19 percentage points) despite preserved discrimination. Calibration analyses demonstrated systematic probability misestimation, and decision curve analysis showed reduced net clinical benefit. Biomarker distribution differences across cohorts were consistent with dataset shift. Conclusion: Plasma biomarker models retain discrimination across cohorts but exhibit clinically meaningful degradation in predictive value under deployment. Calibration instability and prevalence differences critically affect NPV, highlighting the need for cross-cohort validation, calibration assessment, and assay harmonization before clinical implementation.
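The dependence of NPV on prevalence follows directly from Bayes' rule, which is why it can fall even when discrimination is preserved. The following sketch illustrates the relationship with hypothetical sensitivity, specificity, and prevalence values; none of these numbers come from the abstract.

```python
def npv(sensitivity, specificity, prevalence):
    """Negative predictive value from test characteristics and prevalence."""
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)

# Hypothetical operating point held fixed while prevalence rises.
for prevalence in (0.2, 0.3, 0.5):
    print(f"prevalence {prevalence:.1f} -> NPV {npv(0.85, 0.85, prevalence):.3f}")
```

With sensitivity and specificity unchanged, a higher amyloid-positive fraction in the target cohort lowers NPV, so a model tuned in one cohort can rule out disease less reliably in another.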
Rani, A.; Mishra, S.
Accurate histopathological differentiation between High-Grade Serous Carcinoma (HGSC) and Low-Grade Serous Carcinoma (LGSC) remains a critical yet challenging aspect of ovarian cancer diagnosis due to their similar morphology and different clinical outcomes. This study presents a deep learning framework that uses custom attention mechanisms, including the Convolutional Block Attention Module (CBAM), Squeeze-and-Excitation (SE) blocks, and a Differential Attention module within five CNN architectures for automated binary classification of ovarian cancer subtypes from H&E WSI patches. Although individual models achieved higher accuracy, the ensemble stacking framework with a shallow MLP meta-learner delivered the best overall performance, with a ROC-AUC of 0.9211, an accuracy of 0.85, and F1-scores of 0.84 and 0.85 across both subtypes. These findings demonstrate that attention-guided feature recalibration combined with ensemble stacking provides robust and clinically interpretable discrimination of ovarian carcinoma subtypes.
James-Pemberton, P.; Harper, D.; Wagerfield, P.; Watson, C.; Hervada, L.; Kohli, S.; Alder, S.; Shaw, A.
A multiplex diagnostic test is evaluated for self-reported long COVID-associated persistent symptoms and poor recovery from a SARS-CoV-2 infection. A mass-standardised concentration of total antibodies (AC), high-quality (HQ) antibodies and percentage of HQ antibodies (HQ%) is assessed against a spectrum of spike proteins to the SARS-CoV-2 variants: Wuhan, , δ, and the Omicron variants BA.1, BA.2, BA.2.12.1, BA.2.75, BA.5, CH.1.1, BQ.1.1 and XBB.1.5 in three cohorts. These include a cohort of recovered control patients (CC, n = 46) and a cohort of self-declared long COVID patients (LCC, n = 113). A nested Receiver Operating Characteristic (ROC) analysis, performed for the variant with the lowest HQ concentration in the spectrum, produced an area under the curve (AUC) of 0.61 (0.53-0.70) for the CC vs LCC cohorts. For the LCC cohort, cut-off thresholds of AC = 0.8 mg/L, HQ = 1.5 mg/L and HQ% = 34% were determined, leading to 71% sensitivity and 66% specificity derived by the Youden metric. The cohorts may be fully classified based on ROC and outlier analysis to give an incidence of persistent virus of 62% (95% CI 52% - 71%), of hyperimmune of 12% (95% CI 7% - 20%) and of unclassified of 26% (95% CI 18% - 35%). The overall diagnostic accuracy for both the hyperimmune and hypoimmune groups is 69%. All clinical interventions can now be tailored for the heterogeneous long COVID patient cohort.
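The Youden metric selects the cut-off that maximises J = sensitivity + specificity - 1; for the reported 71% sensitivity and 66% specificity, J = 0.37. A minimal sketch of the selection step is given below on toy data. The measurements, the labels, and the convention that values at or above the cut-off count as positive are all assumptions made for illustration, not details from the abstract.

```python
def youden_cutoff(values, labels):
    """Return (cutoff, sensitivity, specificity) maximising Youden's J.

    A sample is called positive when its value is >= cutoff (assumed convention).
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if y == 1 and v >= cutoff)
        tn = sum(1 for v, y in zip(values, labels) if y == 0 and v < cutoff)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best[1], best[2], best[3]

# Toy measurements (hypothetical); label 1 marks the condition of interest.
values = [0.2, 0.5, 0.9, 1.1, 1.4, 1.8, 2.2, 2.9]
labels = [0, 0, 0, 1, 1, 0, 1, 1]
result = youden_cutoff(values, labels)
print(result)
```

Because J weights sensitivity and specificity equally, the chosen threshold may not suit settings where false negatives and false positives carry different costs.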
Ahn, J.; Zack, D.; Zhang, P.
Accurate detection of RNA splice variants is often hindered when transcripts lack large distinguishable exonic regions, making conventional PCR strategies challenging. We developed a simple melting temperature (Tm)-guided exon-exon junction (EEJ) RT-PCR method to enable variant-specific detection under these conditions. Uni-directional primers spanning exon-exon junctions were designed so that roughly half of each primer anneals to each of the two adjacent exons. The Tm of each half-site was set >7 °C below the annealing temperature, preventing stable binding to individual exons and enforcing junction-dependent amplification. The method was evaluated using HTRA1-AS1 long noncoding RNA variants that share overlapping exon sequences but differ in splice connectivity. HTRA1-AS1 comprises five variants, only one with a large distinguishable exon. Tm-guided EEJ primers robustly discriminated the remaining four variants. After optimization, amplification yielded sharp, single bands with minimal cross-reactivity. Compared with conventional designs, this approach reduced heteroduplex and heteroquadruplex formation, improving band clarity. Sanger sequencing confirmed junction specificity, and the method performed well in multiplex settings. Overall, Tm-guided EEJ RT-PCR is a cost-effective, high-resolution approach for detecting RNA variants lacking easily distinguishable exonic regions, readily compatible with standard RT-PCR and qPCR workflows.
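The half-site rule described above can be expressed as a simple check: each half of a junction-spanning primer must have a Tm more than 7 °C below the annealing temperature. The sketch below uses the Wallace rule (2 °C per A/T, 4 °C per G/C) as a rough Tm estimate for short oligonucleotides. The abstract does not state how the authors calculated Tm, and the sequences and function names here are hypothetical.

```python
def wallace_tm(seq):
    """Rough Tm estimate in degrees C for a short oligo (Wallace rule)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc

def junction_primer_ok(left_half, right_half, annealing_temp_c, margin_c=7):
    """True if both half-sites sit more than margin_c below the annealing temperature."""
    limit = annealing_temp_c - margin_c
    return all(wallace_tm(half) < limit for half in (left_half, right_half))

# Hypothetical half-sites flanking an exon-exon junction.
left, right = "ACGTTGCAAT", "GGATCCTTAG"
print(wallace_tm(left), wallace_tm(right))
print(junction_primer_ok(left, right, annealing_temp_c=60))
```

A nearest-neighbour Tm model would be a more accurate substitute for the Wallace rule in practice; the pass/fail logic stays the same.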
Strasser, B.; Mustafa, S.; Holly, M.; Grünberger, M.; Anita, S.
Background: External Quality Assurance (EQA) is an essential component of modern laboratory medicine. Current scientific evidence on EQA focuses primarily on the analyses carried out by EQA providers, while relatively little research has been conducted in individual clinical laboratories. Methods: In this retrospective single-center observational study in a clinical laboratory, EQA results were analyzed over a period of four years (2021-2024). The evaluation was based on EQA action reports documented in the institute's internal quality management system. Deviations were classified according to department, type of discrepancy, root cause category (analytical, preanalytical, systemic, unidentifiable), and measures taken. Results: A total of 7226 EQA participations were evaluated during the observation period. The overall error rate remained consistently low, ranging between 0.8% and 1.6%, with no significant change over time (p = 0.87). Most deviations occurred in the departments of clinical chemistry and immuno/autoimmune diagnostics (p < 0.001). These were predominantly quantitative discrepancies (false low/false negative or false high/false positive). Root cause analysis showed a clear dominance of analytical causes (p < 0.001), while preanalytical and systemic causes were identified less frequently. In most cases, corrective measures, such as re-analyses, recalibrations, process adjustments, or staff training, were implemented promptly. Hard structural measures, such as changing methods or discontinuing tests, were rarely necessary. Conclusion: In a clinical laboratory, EQA is an important tool for structured error analysis and continuous quality improvement. Consistent processing of deviating EQA results goes hand in hand with stable analytical performance and a low error rate.
Brate, J.; Grande, E. G.; Pedersen, B. N.; Frengen, T. G.; Stene-Johansen, K.
Here we evaluated the performance of a previously published tiling PCR primer scheme by Ringlander et al. (2022) for whole-genome amplification of Hepatitis B virus (HBV) in combination with Oxford Nanopore sequencing. The primer set originally developed for Ion Torrent sequencing was adapted by removing platform-specific adapters and tested using clinical serum or plasma samples submitted for routine HBV genotyping and resistance testing. Two multiplexing strategies were compared: a single PCR pool containing all primers and a two-pool strategy with non-overlapping amplicons. Sequencing reads were processed using a Nanopore analysis pipeline, and genome coverage and amplicon performance were compared across samples spanning a wide Ct range and representing HBV genotypes A-E. Across all samples, the median genome coverage was approximately 50%, although recovery varied widely, ranging from complete failure to nearly full genomes. Combining all primers into a single PCR reaction, or separating overlapping amplicons into different reactions, had little overall impact on genome recovery, and no consistent differences between the two pooling strategies were observed. In contrast, amplification efficiency differed markedly between individual amplicons. Amplicons 1-5 generally produced higher sequencing depth, whereas amplicons 6-10 frequently showed low coverage and contributed to incomplete genome recovery. Genome coverage was strongly associated with Ct values, with higher coverage observed in samples with lower Ct values, while coverage was broadly similar across genotypes. These results demonstrate that the Ringlander et al. primer scheme can be adapted for multiplex PCR and Nanopore sequencing of HBV, but uneven amplicon performance limits consistent full-genome recovery and highlights the need for further optimization of HBV tiling PCR designs.
Chandra, S.
Background. Detection of cerebral amyloid pathology currently requires amyloid PET imaging ($5,000-$8,000) or cerebrospinal fluid analysis via lumbar puncture, procedures that are inaccessible for population-level screening. The FDA-cleared Lumipulse G pTau217/Abeta1-42 plasma ratio test (May 2025) represents the first approved blood-based alternative; however, single-ratio approaches cannot distinguish Alzheimer's disease (AD) from non-AD neurodegeneration or provide multi-dimensional disease characterization. Methods. We developed Virtual Spectral Decomposition (VSD), a framework that decomposes plasma biomarker profiles into biologically interpretable diagnostic channels. Four plasma biomarkers - phosphorylated tau-217 (pTau217), amyloid-beta42/40 ratio, neurofilament light chain (NfL), and glial fibrillary acidic protein (GFAP) - were measured in 1,139 Alzheimer's Disease Neuroimaging Initiative (ADNI) participants. Each biomarker was mapped to a VSD channel representing a distinct pathophysiological axis: tau/amyloid phosphorylation, amyloid clearance, neurodegeneration, and astrocytic activation. Channel weights were calibrated via logistic regression, and performance was evaluated against amyloid PET (UC Berkeley) using 10x5-fold repeated cross-validation. Results. VSD 4-channel fusion achieved AUC = 0.900 (+/-0.018), exceeding pTau217 alone (0.888+/-0.022). Optimal sensitivity was 89.7% with 78.1% specificity (NPV = 90.8%). The NfL channel received a negative weight (beta = -1.1), functioning as a disease-exclusion signal: elevated neurodegeneration without amyloid-tau coupling actively reduces the AD probability, distinguishing AD from non-AD neurodegeneration. Complementary CSF proteomics analysis (7,008 proteins, 533 participants) identified 17 amyloid-specific proteins (0.24% of the proteome), revealing a 49:1 tau-to-amyloid asymmetry that explains why blood-based tau markers outperform amyloid markers. Conclusions. 
Blood-based VSD provides an interpretable, multi-channel framework for amyloid detection that incorporates explicit disease-exclusion logic unavailable to single-biomarker approaches. The architecture extends to multi-disease screening, where the same blood specimen could be routed through disease-specific modules for AD, Parkinson's disease, and cancer.
Millasseau, V.; Mallet, D.; Carnicella, S.; Barbier, E. L.; Sauvee, M.; Le Gouellec, A.; Cannet, C.; Pompe, N.; Boulet, S.; Fauvelle, F.
Background. Parkinson's disease (PD) diagnosis remains delayed and suboptimally accurate, largely due to clinical overlap with atypical parkinsonian syndromes and the lack of reliable biomarkers. Here, we evaluated the performance of a previously patented 6-metabolite blood biomarker (6M-BB) for the differential diagnosis of PD and its translation to a clinical IVDr NMR platform. Methods. Patient serum samples from de novo PD (n=30), multiple system atrophy (MSA, n=30), progressive supranuclear palsy (PSP, n=30), Alzheimer's disease (AD, n=33), and healthy individuals (n=29) were profiled by 1H NMR and classified using the 6M-BB. For clinical use, we rebuilt the model on absolute concentrations acquired on a Bruker Avance IVDr 600 MHz system. Results. The 6M-BB validation yielded 0.902 AUC and 87.9% accuracy for PD vs. HC (sensitivity 86.7%, specificity 89.3%), with an overall accuracy of 82.6% across all groups. The IVDr-based refit achieved 0.878 AUC (overall accuracy 77%). Adding VLDL-5 free cholesterol (V5FC) and citrate markedly improved performance to 0.959 AUC, with 94.9% accuracy for PD vs. HC (sensitivity 96.7%, specificity 93.1%) and 84.9% when MSA/PSP were included. Conclusion. The externally validated 6M-BB has demonstrated its robustness for the differential diagnosis of PD compared to other parkinsonian syndromes at the de novo stage. Its successful transfer to a fully automated, standardized IVDr machine, with gains from V5FC and citrate, supports the feasibility and promising potential for clinical implementation, justifying future prospective multicenter studies.
Basilakis, A.; Duenser, M. W.
Background: The Therapeutic Distance framework (Paper 1) achieved AUC 0.61 for orbit-based mortality prediction in 11,627 sepsis patients. We hypothesised that incorporating state-dependent parameter relevance would substantially improve prediction. Methods: We extended the framework to 84,176 ICU patients from MIMIC-IV v3.1 across 16 clinical syndromes. Validation included full-population leave-one-out (n=59,362), head-to-head comparison against SAPS-II and logistic regression on 34,467 matched patients with bootstrap confidence intervals, temporal validation, outcome permutation, sensitivity analysis, and calibration assessment. Results: Full-population leave-one-out achieved AUC 0.832 (n=59,362). On 34,467 matched patients, Therapeutic Distance (AUC 0.841) significantly outperformed both SAPS-II (0.786; delta=+0.055, 95% CI +0.048 to +0.061, p<0.001) and logistic regression (0.788). Temporal validation showed stable performance (delta=-0.006). Outcome permutation confirmed genuine signal (AUC 0.859 to 0.498 with shuffled mortality). Sensitivity analysis demonstrated near-zero variation (delta 0.0006-0.003). The framework performed well for 8 of 16 syndromes (AUC >0.70) and failed for DKA and post-cardiac surgery (AUC <0.40). Conclusions: Therapeutic Distance provides therapy-specific risk stratification that exceeds both established severity scores and standard machine learning while remaining robust to hyperparameter choices, temporal drift, and outcome permutation.
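The outcome-permutation check mentioned above rests on a simple idea: AUC is the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, so shuffling the outcome labels should pull it toward 0.5 if the original signal is genuine. The sketch below shows this on synthetic scores. The data, sample sizes, and function names are hypothetical and unrelated to the MIMIC-IV analysis.

```python
import random

def auc(scores, labels):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def permuted_auc(scores, labels, n_perm=100, seed=0):
    """Mean AUC after repeatedly shuffling the outcome labels."""
    rng = random.Random(seed)
    shuffled = list(labels)
    total = 0.0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        total += auc(scores, shuffled)
    return total / n_perm

# Synthetic cohort: positives score higher on average than negatives.
rng = random.Random(1)
labels = [1] * 50 + [0] * 150
scores = [rng.gauss(2.0 if y else 0.0, 1.0) for y in labels]

observed = auc(scores, labels)
shuffled_mean = permuted_auc(scores, labels)
print(f"observed AUC {observed:.3f}, mean AUC after shuffling {shuffled_mean:.3f}")
```

A shuffled AUC that stays well above 0.5 would instead suggest leakage or an evaluation artefact rather than real predictive signal.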
Yamamoto, R.; Wu, F.; Sprehe, L. K.; Abeer, A.; Celi, L. A.; Tohyama, T.
Clinical prediction models for sepsis frequently degrade when applied outside the development setting. Electronic health record data encode not only patient physiology but also observation processes such as measurement timing and frequency, which may be predictive within a site but unstable across sites. The contribution of these observation-process features to cross-site performance degradation has not been quantified. In this retrospective cohort study, we developed models for in-hospital mortality in adult intensive care unit (ICU) patients meeting Sepsis-3 criteria using Medical Information Mart for Intensive Care IV (MIMIC-IV) (n = 30,218; 16.3% mortality) and externally validated them in eICU Collaborative Research Database (eICU-CRD) (n = 31,403; 13.9% mortality). We compared seven prespecified model specifications representing physiologic summary strategies (a single aggregate severity score, most recent values, extreme values, and within-window variability), each evaluated with and without measurement counts as observation-process features. Models were fit using logistic regression and gradient-boosted trees. Internally, discrimination improved with more detailed physiologic summaries and measurement counts (logistic regression area under the receiver operating characteristic curve [AUROC] from 0.819 to 0.834). In external validation, performance drops were larger for specifications using more complex physiologic representations. Adding measurement counts was associated with larger domain shift (AUROC change, -0.047 versus -0.082 with counts in logistic regression). External calibration deteriorated progressively, with calibration slopes decreasing from 1.007 for the simplest model to 0.417 for the most complex specification in logistic regression. Gradient-boosted trees showed smaller incremental degradation from measurement counts but still exhibited domain shift in complex specifications. 
Inclusion of observation-process features in sepsis mortality prediction models was associated with improved internal discrimination but worse external calibration and transportability. These findings highlight that feature engineering decisions involve a tradeoff between internal performance and external generalizability, and that calibration assessment provides the most sensitive indicator of reduced transportability.
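The calibration slope reported above is conventionally obtained by regressing the observed outcome on the logit of the predicted probability; a slope near 1 suggests well-scaled predictions, while a slope well below 1 suggests predictions that are too extreme. The following is a minimal sketch of that idea in plain Python, not the authors' code: the function names (`calibration_slope`, `simulate_cohort`) and the simulated cohort are hypothetical and introduced only for illustration.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def logit(p, eps=1e-12):
    """Log-odds of a probability, clipped away from 0 and 1."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def calibration_slope(y_true, p_pred, n_iter=50, tol=1e-10):
    """Fit y ~ a + b * logit(p_pred) by Newton-Raphson.

    Returns (a, b): calibration intercept and calibration slope.
    Assumes the predictions are not all identical and the outcomes
    are not perfectly separated by the predictions.
    """
    x = [logit(p) for p in p_pred]
    a, b = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y_true):
            mu = sigmoid(a + b * xi)
            w = mu * (1.0 - mu)
            r = yi - mu
            g0 += r            # gradient w.r.t. intercept
            g1 += r * xi       # gradient w.r.t. slope
            h00 += w           # entries of X^T W X
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a += da
        b += db
        if abs(da) + abs(db) < tol:
            break
    return a, b


def simulate_cohort(n, seed=0):
    """Hypothetical cohort: true log-odds z ~ N(0, 1), outcome ~ Bernoulli(sigmoid(z))."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [1 if rng.random() < sigmoid(v) else 0 for v in z]
    return y, z
```

As a usage illustration, predictions equal to `sigmoid(z)` on the simulated cohort should give a slope close to 1, whereas deliberately overconfident predictions `sigmoid(2 * z)` should give a slope close to 0.5, which mirrors the direction of the external-validation deterioration described in the abstract.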
Jourdan, O.; Duchiron, M.; Torrent, J.; Turpinat, C.; Mondesert, E.; Busto, G.; Morchikh, M.; Dornadic, M.; Delaby, C.; Hirtz, C.; Thizy, L.; Barnier-Figue, G.; Perrein, F.; Jurici, S.; Gabelle, A.; Bennys, K.; Lehmann, S.
Objectives: To evaluate the diagnostic performance of the α-synuclein seed amplification assay (SAA) and characterize the impact of α-synuclein co-pathology on cognitive and biological profiles in routine clinical practice. Methods: We included 398 patients from the prospective multicenter ALZAN cohort recruited from memory clinics in Montpellier, Nimes, and Perpignan. All participants underwent CSF and blood sampling with measurement of CSF biomarkers (Aβ42/40, tau, ptau181) and plasma biomarkers (Aβ42/40, ptau181, ptau217, GFAP, NfL). Cognitive assessment was performed using the Mini-Mental State Examination (MMSE). Clinical diagnoses were independently confirmed by two senior neurologists. αSyn status was determined by SAA (RT-QuIC). Results: Of 398 patients, 19 out of 20 patients with Lewy body dementia (LBD) (95.0%) and 32 out of 203 patients with AD (15.8%) were SAA+. SAA positivity had a sensitivity of 95% and a specificity of 93.5% for distinguishing LBD from patients without LBD or AD. In the entire cohort, SAA+ patients showed lower MMSE scores (p<0.01), lower CSF Aβ42/40 ratio (p<0.01), and elevated plasma GFAP (p<0.05). Within the AD subgroup, no significant differences in CSF or blood biomarkers were observed between SAA+ and SAA- patients, except for a lower CSF Aβ42/40 ratio in SAA+ patients (p<0.01). Interpretation: SAA demonstrates good diagnostic capabilities for detecting LBD and confirms notable αSyn co-pathology in AD. This study highlights the limitations of routine CSF and emerging blood biomarkers in capturing αSyn pathology and the value of integrating SAA into routine neurodegenerative disease assessment.
Adegbosin, O. T.; Patel, H.
Background: Microsatellite stability status determination is important for prognostication and therapeutic decision making in colorectal cancer management, but the conventional methods for this assessment are not readily available, especially in low- and middle-income countries. Deep learning (DL) models have been proposed for addressing this problem; however, potential computational cost due to model complexity and inadequate explainability may limit their adoption in low-resource settings. This study explored the potential of explainable lightweight models for detection of microsatellite instability in colorectal cancer. Methods: DL models were trained using a public dataset of colorectal cancer histology images and then used to classify a set of test images into one of two classes: microsatellite instability or microsatellite stability. The models were compared for efficiency. Gradient-weighted class activation mapping (Grad-CAM) was used to interpret the models' decision making. Results: The simpler convolutional neural network (CNN) trained from scratch had modest performance (accuracy=0.757, area under receiver-operating characteristic curve [AUROC]=0.840). With an attention mechanism added, these values increased, but specificity and sensitivity decreased. Pretrained models performed better than the ones trained from scratch, and EfficientNet_B0 had the best balance of high performance and low computational requirements (accuracy=0.936, AUROC=0.990, negative predictive value=0.923, specificity=0.953, 4,010,000 trainable parameters, 0.38 gigaFLOPs). However, a simple CNN model with attention mechanism had the best interpretability based on Grad-CAM. Conclusion: This study demonstrated that DL models that are lightweight when compared to previously proposed ones can be useful for colorectal cancer microsatellite instability screening in resource-limited settings while balancing performance and computational efficiency.